PyDigger - unearthing stuff about Python


Name Version Summary Date
pdf-searchable-ocr 0.1.1 A simple Python package for OCR with searchable PDF generation using PaddleOCR 2025-10-09 12:39:52
quanta-pdf 1.0.2 Advanced PDF layout analysis engine for extracting figures, tables, and structured content 2025-10-09 09:35:27
kiwi-pdf-chunker 0.3.3 A tool for parsing PDF document layouts and chunking content 2025-10-08 10:31:42
privision 1.0.1 Video content redaction tool - OCR-based information recognition and masking system 2025-10-07 17:32:48
doc2mark 0.4.1 Unified document processing with AI-powered OCR 2025-10-07 05:09:42
kreuzberg 3.15.0 Document intelligence framework for Python - Extract text, metadata, and structured data from diverse file formats 2025-09-14 18:14:57
sparrow-parse 1.1.3 Sparrow Parse is a Python package (part of Sparrow) for parsing and extracting information from documents. 2025-09-14 14:00:15
doctra 0.3.3 Parse, extract, and analyze documents with ease 2025-09-14 11:18:55
pdf2markdown 0.3.0 Python library and CLI tool that leverages LLMs to convert technical PDF documents to well-structured Markdown 2025-09-14 02:02:58
docstrange 1.1.6 Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, JSON, CSV, HTML) with intelligent content extraction and advanced OCR. 2025-09-10 09:27:30
docling-onnx-models 0.1.3 ONNX Runtime implementations for Docling AI models 2025-09-09 08:45:47
mseep-kreuzberg 3.13.4 Document intelligence framework for Python - Extract text, metadata, and structured data from diverse file formats 2025-09-09 03:44:56
bot-vision-suite 1.1.1 Advanced Python library for GUI automation with multi-technique OCR, robust image detection, and an intelligent backtracking system 2025-09-08 11:12:22
dedoc 2.5 Extract content and logical tree structure from textual documents 2025-09-08 03:25:51
mcp-pdf 1.0.1 Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more 2025-09-07 07:00:52
mpxpy 0.0.18 Official Mathpix client for Python 2025-09-05 17:43:50
sem-meta 0.1.0 Unified interface for SEM image processing: metadata extraction, OCR-based pixel size estimation, and unit conversion 2025-09-05 15:12:45
marker-pdf 1.9.2 Convert documents to markdown with high speed and accuracy. 2025-09-04 18:45:56
docu-devs-api-client 1.0.8 A client library for accessing DocuDevs API 2025-09-04 15:25:43
docuglean-ocr 1.0.0 An SDK for intelligent document processing using SOTA VLLM models 2025-09-02 13:19:12